Mining Multi-Dimensional Constrained Gradients in Data Cubes
Abstract
Constrained gradient analysis (similar to the “cubegrade” problem posed by Imielinski et al. [9]) extracts pairs of cells with similar characteristics that are associated with large changes in measure in a data cube. Two cells are considered similar if they are related by a roll-up, a drill-down, or a one-dimensional mutation operation. Constrained gradient queries are expressive: they can capture trends in data and answer “what-if” questions. To facilitate our discussion, we call one cell in a gradient pair the probe cell and the other the gradient cell. We develop an efficient algorithm that pushes constraints deep into the computation process and finds all gradient-probe cell pairs in one pass. It explores bi-directional pruning between probe cells and gradient cells, utilizing transformed measures and dimensions. Moreover, it adopts a hyper-tree structure and an H-cubing method to compress data and maximize sharing of computation. Our performance study shows that this algorithm is efficient and scalable.
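To make the terminology concrete, the following is a minimal brute-force sketch in Python. It is not the one-pass algorithm described in the abstract: it uses no pruning, no hyper-tree, and no H-cubing. It only illustrates what a probe cell, a gradient cell, and the three similarity relations look like. The use of the average as the measure, the ratio form of the gradient constraint, the `*` marker for an aggregated dimension, and all function names and sample data are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # assumed marker for a dimension that has been aggregated away


def build_cube(rows, n_dims):
    """Aggregate (dim_1, ..., dim_n, measure) rows into every cell; store the average."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for *dims, measure in rows:
        # Each row contributes to 2^n cells: every way of replacing values with ALL.
        for mask in product([False, True], repeat=n_dims):
            cell = tuple(ALL if hide else d for d, hide in zip(dims, mask))
            sums[cell] += measure
            counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


def is_ancestor(a, b):
    """True if a is a strict roll-up of b (a equals b except where a holds ALL)."""
    return a != b and all(x == ALL or x == y for x, y in zip(a, b))


def is_mutation(a, b):
    """True if a and b differ in exactly one dimension and neither value there is ALL."""
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    return len(diffs) == 1 and ALL not in diffs[0]


def related(a, b):
    """Roll-up, drill-down, or one-dimensional mutation."""
    return is_ancestor(a, b) or is_ancestor(b, a) or is_mutation(a, b)


def gradient_pairs(cube, probe_ok, min_ratio):
    """Naive search: check every (probe, gradient) pair against the ratio constraint."""
    pairs = []
    for probe, p_val in cube.items():
        if not probe_ok(probe) or p_val == 0:
            continue
        for grad, g_val in cube.items():
            if related(probe, grad) and g_val / p_val >= min_ratio:
                pairs.append((probe, grad, g_val / p_val))
    return pairs


# Hypothetical sample data: (city, year, price).
rows = [
    ("Vancouver", "2000", 100.0),
    ("Vancouver", "2001", 150.0),
    ("Toronto", "2000", 100.0),
    ("Toronto", "2001", 110.0),
]
cube = build_cube(rows, 2)
# Probe constraint: year is 2000. Gradient constraint: average price grows by at least 40%.
for probe, grad, ratio in gradient_pairs(cube, lambda c: c[1] == "2000", 1.4):
    print(probe, "->", grad, ratio)
```

The nested loop compares every probe cell with every other cell, which is quadratic in the number of cells. That cost is what an approach based on constraint pushing and bi-directional pruning aims to avoid.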
Similar resources
Metarule-Guided Mining of Multi-Dimensional Association Rules Using Data Cubes
In this paper, we employ a novel approach to metarule-guided, multi-dimensional association rule mining which explores a data cube structure. We propose algorithms for metarule-guided mining: given a metarule containing p predicates, we compare mining on an n-dimensional (n-D) cube structure (where p < n) with mining on multiple smaller p-dimensional cubes. In addition, we propose an efficient m...
Using Data Cubes for Metarule-Guided Mining of Multi-Dimensional Association Rules
Metarule-guided mining is an interactive approach to data mining, where users probe the data under analysis by specifying hypotheses in the form of metarules, or pattern templates. Previous methods for metarule-guided mining of association rules have primarily used a transaction/relation table-based structure. Such approaches require costly, multiple scans of the data in order to find all the la...
OMARS: The Framework of an Online Multi-Dimensional Association Rules Mining System
Recently, the integration of data warehouses and data mining has been recognized as the primary platform for facilitating knowledge discovery. Effective data mining from data warehouses, however, needs exploratory data analysis. The users often need to investigate the warehousing data from various perspectives and analyze them at different levels of abstraction. To this end, comprehensive infor...
A Parallel Scalable Infrastructure for OLAP and Data Mining
Decision support systems are important in leveraging information present in data warehouses in businesses like banking, insurance, retail and health-care among many others. The multi-dimensional aspects of a business can be naturally expressed using a multi-dimensional data model. Data analysis and data mining on these warehouses pose new challenges for traditional database systems. OLAP and da...